 computer science paper


An AI helps you summarize the latest in AI

#artificialintelligence

The news: A new AI model for summarizing scientific literature can now help researchers wade through the latest papers and identify the ones they want to read. On November 16, the Allen Institute for Artificial Intelligence (AI2) rolled out the model on its flagship product, Semantic Scholar, an AI-powered scientific paper search engine. It provides a one-sentence tl;dr (too long; didn't read) summary under every computer science paper (for now) when users use the search function or go to an author's page. The work was also accepted to the Empirical Methods in Natural Language Processing conference this week.

The context: In an era of information overload, using AI to summarize text has been a popular natural-language processing (NLP) problem.


Quantum scientists embrace machine learning to push research and application - Inside The Perimeter

#artificialintelligence

The last few years have seen an explosion of interest in quantum machine learning to accelerate scientific discovery in a range of fields, from quantum computing to the development of new materials and medicines. That effort deepened in July as researchers from industry and academia gathered for the week-long workshop "Machine Learning for Quantum Design" at Perimeter Institute. Conference co-organizer Roger Melko said the conference demonstrated the remarkable progress researchers have made in just a few years since the previous gathering of its kind at Perimeter. "We first had this conference on quantum machine learning three years ago, and it was largely blue-sky proposals and ideas back then," he said. "Now, the scientists here are actually implementing those ideas. The field is changing fast and the pace of that change is accelerating."


Looking Beyond Text: Extracting Figures, Tables and Captions from Computer Science Papers

AAAI Conferences

Identifying and extracting figures and tables along with their captions from scholarly articles is important both as a way of providing tools for article summarization, and as part of larger systems that seek to gain a deeper, semantic understanding of these articles. While many "off-the-shelf" tools, such as PDFBox and Poppler, can extract embedded images from these documents, they are unable to extract tables, captions, and figures composed of vector graphics. Our proposed approach analyzes the structure of individual pages of a document by detecting chunks of body text, and locates the areas where figures or tables could reside by reasoning about the empty regions within that text. This method can extract a wide variety of figures because it does not make strong assumptions about the format of the figures embedded in the document, as long as they can be differentiated from the main article's text. Our algorithm also includes a caption-to-figure matching component that is effective even in cases where individual captions are adjacent to multiple figures. Our contribution also includes methods for leveraging particular consistency and formatting assumptions to identify titles, body text and captions within each article. We introduce a new dataset of 150 computer science papers along with ground truth labels for the locations of the figures, tables and captions within them. Our algorithm achieves 96% precision at 92% recall when tested against this dataset, surpassing the previous state of the art. We release our dataset, code, and evaluation scripts on our project website to enable future research.
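
The idea of reasoning about empty regions within the body text can be illustrated with a deliberately simplified sketch. This is not the authors' algorithm: it assumes text blocks have already been detected as bounding boxes, treats the page as a single column, and only looks for full-width vertical gaps. The function name `candidate_regions`, the `min_height` threshold, and the coordinate convention are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# (x0, y0, x1, y1) with y increasing downward; an assumed convention.
Box = Tuple[float, float, float, float]


def candidate_regions(page: Box, text_boxes: List[Box],
                      min_height: float = 100.0) -> List[Box]:
    """Return full-width bands of the page that no text box covers.

    Bands shorter than min_height are ignored, so ordinary margins and
    paragraph spacing are not reported as figure candidates.
    """
    px0, py0, px1, py1 = page
    # Keep only the vertical extent of each text box, clipped to the page.
    spans = sorted((max(y0, py0), min(y1, py1)) for _, y0, _, y1 in text_boxes)

    regions: List[Box] = []
    cursor = py0  # lowest point already known to be covered by text
    for top, bottom in spans:
        if bottom <= top:
            continue  # box lies entirely outside the page
        if top - cursor >= min_height:
            regions.append((px0, cursor, px1, top))
        cursor = max(cursor, bottom)
    # Gap between the last text box and the bottom of the page.
    if py1 - cursor >= min_height:
        regions.append((px0, cursor, px1, py1))
    return regions


if __name__ == "__main__":
    page = (0, 0, 612, 792)  # US Letter in points
    text = [(72, 72, 540, 200), (72, 420, 540, 700)]
    print(candidate_regions(page, text))
```

A real system would also need to handle multi-column layouts, partial-width gaps, and caption matching, none of which this sketch attempts.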